The Tunisian_MSA corpus was originally collected to train acoustic models for pronunciation modeling in Arabic language learning applications.
<p>
  The data were collected in 2003 near Tunis, the capital of the Republic of Tunisia.
<p>
  The Tunisian_MSA corpus is divided into recited and prompted speech subcorpora.
  The recited speech is stored under the recordings directory.
  The prompted speech is stored under the answers directory.
Each of the 118 informants contributed to both subcorpora by reciting sentences and providing answers to prompted questions. 
The Tunisian_MSA corpus contains 11.2 hours of speech.
<p>
  A small corpus was collected in 2017 for testing.
  It consists of speech from 4 speakers: 3 Libyan men and 1 Tunisian woman.
  

